COVID-19 MAPPING

Felipe Solares

24/04/2020

About

This is a project using Python 3.7, developed by me, Felipe Solares da Silva. It is part of my professional portfolio, and if you want to see more projects like this, check my portfolio at https://github.com/fsolares/professional-portfolio.

Contact: solares.fs@gmail.com


COVID-19 Interactive Map

Project Purpose

Build interactive maps in order to understand COVID-19 occurrences in Brazilian territory.


Step 1 - Installing and Importing Essential Packages and Modules

For this project, we're going to use only three libraries: geopandas, pandas and folium (branca, which we use later for the colormap, is installed together with folium). Pandas is an old friend of every data scientist, so I'm assuming that you already have it installed on your machine. To install the other packages, just run the code below.

In [ ]:
!pip install folium
!pip install geopandas
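
If you want to confirm that the installation worked before importing anything, a small check like the one below can help. It is only a sketch: missing_packages is a hypothetical helper name (not part of any library), and it relies solely on importlib from the standard library.

```python
import importlib.util


def missing_packages(names):
    """Return the package names from `names` that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# The three libraries used in this project.
print(missing_packages(['pandas', 'geopandas', 'folium']))
```

An empty list means all three packages are available; any name that is printed still needs to be installed.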

Installing geopandas on Windows turned out to be quite an exercise. If you're using Windows as well and stumble over the same obstacles that I did, follow the steps in this wonderful tutorial: https://geoffboeing.com/2014/09/using-geopandas-windows/. Once everything above is installed, you are ready to proceed to the next cell.

In [1]:
import geopandas as gpd
import pandas as pd
import folium
from folium.map import *
from folium import plugins
from folium.plugins import MeasureControl
from folium.plugins import FloatImage
from branca.colormap import LinearColormap

Step 2 - Preparing the Data

For this project, we're going to use two main files. The first is a GeoJSON file downloaded from the EXPLORATORY site (https://exploratory.io/map). GeoJSON is based on JavaScript Object Notation (JSON) and is used to encode a variety of geographic data structures. Its features include points (addresses and locations), line strings (streets, highways and boundaries), polygons (countries, provinces, tracts of land), and multi-part collections of these types. Here is an example of a GeoJSON structure:

{ "type": "FeatureCollection", "features": [ { "type": "Feature", "properties": {}, "geometry": { "type": "Polygon", "coordinates": [ [ [100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0] ] ] } } ] }
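
To get a feel for this structure, here is a minimal sketch that parses a small FeatureCollection with Python's built-in json module and walks through its features. The "name" property and the square polygon are illustrative values, not data from this project.

```python
import json

# A minimal, valid FeatureCollection with one square polygon (illustrative data).
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "properties": {"name": "example square"},
      "geometry": {
        "type": "Polygon",
        "coordinates": [
          [[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]
        ]
      }
    }
  ]
}
"""

collection = json.loads(geojson_text)

for feature in collection['features']:
    geometry = feature['geometry']
    # A Polygon's first ring is its outer boundary; positions are [longitude, latitude].
    outer_ring = geometry['coordinates'][0]
    print(feature['properties']['name'], geometry['type'], len(outer_ring))
```

The ring holds five positions because the first and the last one coincide, which is how GeoJSON closes a polygon.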

The second is a CSV file created from Brazilian Health Ministry site (https://covid.saude.gov.br/) data. The site compiles the information reported by all Brazilian states, such as incidence, confirmed cases, confirmed deaths and mortality. It provides a CSV file, updated daily, going back to the first COVID-19 occurrence. After a lot of cleaning and transforming, we structured the data and stored it in a new CSV file (which you can find in this repository!) for our further analysis.

Load and Transform

Let's use the read_file function from the geopandas package to load the br_states.geojson file, and then drop the columns we don't need and rename the rest.

In [2]:
geobr = gpd.read_file('br_states.geojson')

# Deleting columns.

del geobr['id']
del geobr['regiao_id']
del geobr['codigo_ibg']

# Renaming Columns.

geobr.columns = ['state', 'initials', 'geometry']

# Checking the data.

geobr.head()
Out[2]:
state initials geometry
0 Acre AC POLYGON ((-73.60818 -7.20194, -72.86963 -7.528...
1 Alagoas AL POLYGON ((-35.46516 -8.82549, -35.46626 -8.827...
2 Amazonas AM POLYGON ((-67.32553 2.03012, -67.32234 2.01741...
3 Amapá AP POLYGON ((-51.18168 4.00889, -51.17900 3.99812...
4 Bahia BA POLYGON ((-39.28820 -8.56290, -39.28229 -8.567...

Now, let's load BRnCov19_10052020.csv using the read_csv function, keeping only the columns we need to ease our future analysis. This data set contains cumulative information about Brazilian occurrences from day one until May/10. So, the goal here is to extract the May/10 portion of the data and prepare it for merging.

In [3]:
sus = pd.read_csv('../SUS_csv/BRnCov19_10052020.csv', sep=';',
                  usecols=['estado', 'data', 'casosAcumulados', 'obitosAcumulados'])

# 1 - Renaming all selected columns.

sus.columns = ['initials', 'date', 'cumcases', 'cumdeaths']

# 2 - Changing date column data type to datetime.

sus['date'] = pd.to_datetime(sus['date'])

# 3 - Extract May/10 portion.

sus.set_index('date', inplace=True)
sus = sus.loc['2020-05-10']
sus.reset_index(inplace=True)

# 4 - Merging geobr and sus data frames.

br = geobr.merge(sus, on='initials')

# 5 - Deleting date column.

del br['date']

# 6 - Checking the data.

br.head()
Out[3]:
state initials geometry cumcases cumdeaths
0 Acre AC POLYGON ((-73.60818 -7.20194, -72.86963 -7.528... 1447 41
1 Alagoas AL POLYGON ((-35.46516 -8.82549, -35.46626 -8.827... 2258 126
2 Amazonas AM POLYGON ((-67.32553 2.03012, -67.32234 2.01741... 12599 1004
3 Amapá AP POLYGON ((-51.18168 4.00889, -51.17900 3.99812... 2613 72
4 Bahia BA POLYGON ((-39.28820 -8.56290, -39.28229 -8.567... 5558 202
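
One thing to keep in mind: merge performs an inner join by default, so a state missing from the CSV would silently disappear from the map. The sketch below shows one way to check for that, using toy data frames with made-up numbers (states and reports are hypothetical stand-ins for geobr and sus). It also filters the date with a boolean mask instead of going through the index.

```python
import pandas as pd

# Toy stand-ins for the real tables (values are made up for illustration).
states = pd.DataFrame({'state': ['Acre', 'Alagoas', 'Bahia'],
                       'initials': ['AC', 'AL', 'BA']})
reports = pd.DataFrame({
    'initials': ['AC', 'AC', 'AL', 'AL'],
    'date': ['2020-05-09', '2020-05-10', '2020-05-09', '2020-05-10'],
    'cumcases': [10, 12, 20, 25],
})
reports['date'] = pd.to_datetime(reports['date'], format='%Y-%m-%d')

# Keep only the rows for the day of interest with a boolean mask.
latest = reports[reports['date'] == pd.Timestamp('2020-05-10')]

# A left merge keeps every state; indicator=True adds a "_merge" column that
# tells whether each row found a match, and validate guards against duplicates.
merged = states.merge(latest, on='initials', how='left',
                      indicator=True, validate='one_to_one')

# States that have no report for that day.
missing = merged.loc[merged['_merge'] == 'left_only', 'initials'].tolist()
print(missing)
```

Here the toy report has no row for Bahia, so BA shows up in the list. With the real data, an empty list would confirm that every state found its match.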

After the merge, the br data frame is ready for the next step. Let's run a statistical summary using the describe() function to gather important metrics that will help in further evaluations.

In [4]:
br.cumcases.describe()
Out[4]:
count       27.000000
mean      6025.888889
std       9249.856616
min        362.000000
25%       1374.500000
50%       2542.000000
75%       6407.000000
max      45444.000000
Name: cumcases, dtype: float64
In [5]:
br.cumdeaths.describe()
Out[5]:
count      27.000000
mean      411.962963
std       789.300541
min        11.000000
25%        42.000000
50%        97.000000
75%       290.500000
max      3709.000000
Name: cumdeaths, dtype: float64
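
Since we'll need the minimum and maximum later, it may be handy to read them from the summary instead of copying the numbers by hand. A minimal sketch, assuming a small illustrative Series (only five values, not the full 27-state column):

```python
import pandas as pd

# A handful of values for illustration only (not the full 27-state column).
cases = pd.Series([362, 1447, 2542, 6407, 45444], name='cumcases')

summary = cases.describe()

# describe() returns a Series, so each metric can be read by its label.
vmin, vmax = summary['min'], summary['max']
median = summary['50%']
print(vmin, median, vmax)
```

The same labels ('min', '50%', 'max') work on the output of br.cumcases.describe() and br.cumdeaths.describe().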

Step 3 - Initialising Cumulative Cases Map

To make a dynamic map, we use the function folium.Map to initialise it, centered on our geographic region. Brazil is located at latitude -14.235004 and longitude -51.92528 (https://www.geodatos.net/en/coordinates/brazil), in the southern hemisphere part of South America. So, we're going to set our center using these values.

In [40]:
# Defining coordinates of where we want to center our map

c = [-14.235004, -51.925282]

#Creating the map

cumcasesmap = folium.Map(width= 600, height= 400, location = c, zoom_start = 4, max_zoom= 5, tiles= 'cartodbpositron')
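
The center above was looked up by hand. As an alternative, a rough center can be computed from the coordinates themselves. The sketch below takes the midpoint of a bounding box; bbox_center is a hypothetical helper, the square is illustrative data, and the result will generally differ a little from a published country center.

```python
def bbox_center(rings):
    """Return [latitude, longitude] of the bounding-box midpoint of some rings."""
    # GeoJSON positions are stored as [longitude, latitude].
    lons = [position[0] for ring in rings for position in ring]
    lats = [position[1] for ring in rings for position in ring]
    # folium.Map expects location as [latitude, longitude], so the order flips here.
    return [(min(lats) + max(lats)) / 2, (min(lons) + max(lons)) / 2]


# One polygon with a single ring (illustrative square).
square = [[[100.0, 0.0], [101.0, 0.0], [101.0, 1.0], [100.0, 1.0], [100.0, 0.0]]]
center = bbox_center(square)
print(center)  # [0.5, 100.5]
```

The main point to remember is the ordering: GeoJSON stores longitude first, while folium's location takes latitude first.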

Creating a Colormap

A colormap is a layer that assigns colors to our geographic regions. First, we use branca's LinearColormap to create a colormap, which is a linear interpolation between two or more colors. A branca colormap can be created from RGB tuples or from color-name shortcuts. In my map, I’ll color from white through light blue to purple using a list with these colors. As its name implies, the colormap maps numbers to colors, so we need to set the endpoints of the map to the minimum and maximum of our variable. Do you remember when we ran the describe() function in the previous step? Using its min and max metrics, we will be able to set the required boundaries.

In [41]:
# Creating a Colormap for Cumulative Cases

colormap = LinearColormap(colors= ['white', 'lightblue', 'purple'],
                           index= [362, 4000, 45444], vmin=362, vmax=45444)

colormap.caption = 'COVID-19 Cumulative Cases May/10'
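
To make the idea of a linear colormap more concrete, here is a plain-Python sketch of piecewise-linear interpolation between colors. This is not branca's implementation, just an illustration of the concept: linear_color and hex_color are hypothetical helpers, the RGB tuples are the standard web values for white, lightblue and purple, and values outside the range are simply clamped to the end colors.

```python
WHITE, LIGHTBLUE, PURPLE = (255, 255, 255), (173, 216, 230), (128, 0, 128)


def hex_color(rgb):
    """Format an (r, g, b) tuple of 0-255 integers as a hex color string."""
    return '#{:02x}{:02x}{:02x}'.format(*rgb)


def linear_color(value, stops, colors):
    """Interpolate linearly between `colors` placed at the numeric `stops`."""
    # Clamp values that fall outside the covered range.
    if value <= stops[0]:
        return hex_color(colors[0])
    if value >= stops[-1]:
        return hex_color(colors[-1])
    # Find the segment that contains the value and blend its two end colors.
    for lo, hi, c_lo, c_hi in zip(stops, stops[1:], colors, colors[1:]):
        if lo <= value <= hi:
            t = (value - lo) / (hi - lo)
            return hex_color(tuple(round(a + t * (b - a))
                                   for a, b in zip(c_lo, c_hi)))


stops = [362, 4000, 45444]
colors = [WHITE, LIGHTBLUE, PURPLE]
for cases in (362, 4000, 12599, 45444):
    print(cases, linear_color(cases, stops, colors))
```

Because the middle stop sits at 4000, most of the color range is spent on the states with fewer cases, which helps to tell them apart even though one state has a much larger count.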

Adding Layers

Now that our map is initialised and our colormap is set, we’ll use folium.GeoJson to add a layer to the map. Within this call we’ll:

  • Point it to our geopandas data frame.
  • Use style_function to color the regions based on the colormap and the variable value.
  • Use highlight_function to emphasise the region under the cursor, and tooltip to identify the fields that we want the user to see when hovering.
In [42]:
folium.GeoJson(br, 
              name='10/05/2020', 
              style_function=lambda x: {'fillColor': colormap(x['properties']['cumcases']), 
                                        'color': 'black',
                                        'fillOpacity':0.7,
                                        'weight': 1}, 
              highlight_function=lambda x: {'weight':1, 'color':'black', 'fillOpacity':1},
              tooltip=folium.features.GeoJsonTooltip(fields=['state', 'initials', 'cumcases'], 
                                                    aliases=['State:', 'Initials:', 'Cumulative Cases:'])).add_to(cumcasesmap)
colormap.add_to(cumcasesmap)
Out[42]:
[Colormap legend: 362 to 45444]
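
The style_function receives each GeoJSON feature as a dictionary and returns a dictionary of style options, which is why the lambda above reads x['properties']['cumcases']. If you prefer something reusable over a lambda, a small factory like the sketch below is one option. make_style_function and the gray fallback for missing values are my own assumptions for illustration, not something folium provides.

```python
def make_style_function(color_for, field, missing_color='lightgray'):
    """Build a style function that colors a feature by one of its properties.

    color_for: any callable that maps a number to a color (e.g. a colormap).
    field: name of the property to read from each feature.
    """
    def style(feature):
        value = feature['properties'].get(field)
        # Fall back to a neutral color when the property is absent.
        fill = missing_color if value is None else color_for(value)
        return {'fillColor': fill, 'color': 'black',
                'fillOpacity': 0.7, 'weight': 1}
    return style


# A stand-in for a colormap, just for this demonstration.
def two_tone(value):
    return 'purple' if value > 1000 else 'white'


style = make_style_function(two_tone, 'cumcases')
print(style({'properties': {'state': 'Acre', 'cumcases': 1447}}))
print(style({'properties': {'state': 'Unknown'}}))
```

With the objects from this notebook, the call would look like style_function=make_style_function(colormap, 'cumcases'), and the same factory could serve the deaths map by passing a different colormap and field.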

The Cumulative Cases Map

In [44]:
# Calling the object that holds our map.

cumcasesmap
Out[44]:
[Interactive map: COVID-19 cumulative cases by Brazilian state, May/10]

Step 4 - Initialising Cumulative Deaths Map

In this part, we're going to repeat the same process described in the previous step, using cumulative deaths instead of cumulative cases.

In [61]:
# Centering the map at the same coordinates "c" and initialising our map.

cumdeathsmap = folium.Map(width= 600, height= 400, location= c, zoom_start= 4, max_zoom= 5, tiles= 'cartodbpositron')

Creating a Colormap

In [62]:
# Creating a Colormap for Cumulative Deaths

colormap2 = LinearColormap(colors= ['white', 'pink', 'red'], 
                          index= [11, 300, 3709], vmin= 11, vmax= 3709)
colormap2.caption = 'COVID-19 Cumulative Deaths May/10'

Adding Layers

In [63]:
folium.GeoJson(br, name= '10/05/2020', 
               style_function= lambda x: {'fillColor': colormap2(x['properties']['cumdeaths']), 
                                         'color': 'black', 
                                         'fillOpacity': 0.7,
                                         'weight': 1},
              highlight_function= lambda x: {'weight': 1, 'color': 'black', 'fillOpacity': 1},
              tooltip=folium.features.GeoJsonTooltip(fields= ['state', 'initials', 'cumdeaths'],
                                                    aliases= ['State:', 'Initials:', 'Cumulative Deaths:'])).add_to(cumdeathsmap)
colormap2.add_to(cumdeathsmap)
Out[63]:
[Colormap legend: 11 to 3709]

The Cumulative Deaths Map

In [64]:
# Calling the object that holds our map.

cumdeathsmap
Out[64]:
[Interactive map: COVID-19 cumulative deaths by Brazilian state, May/10]

That’s all for today! If you’d like to take a look at another project, feel free to check out my GitHub portfolio at https://github.com/fsolares/professional-portfolio